ABSTRACT 

A method detects similar objects in a collection of such objects by modification of 
a previous method in such a way that per-object memory requirements are reduced while 
false detections are avoided approximately as well as in the previous method. The 
modification includes (i) combining k samples of features into s supersamples, the value 
of k being reduced from the corresponding value used in the previous method; (ii) 
recording each supersample to b bits of precision, the value of b being reduced from the 
corresponding value used in the previous method; and (iii) requiring l matching 
supersamples in order to conclude that the two objects are sufficiently similar, the value 
of l being greater than the corresponding value required in the previous method. One 
application of the invention is in association with a web search engine query service to 
determine clusters of query results that are near-duplicate documents. 
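
The three parameters described above can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes word shingles as features, min-hash values as samples, and a BLAKE2b-based hash, and the names `shingles`, `sketch`, and `similar` as well as the default values of s, k, b, and the match threshold (called `l` here) are hypothetical choices made only for illustration.

```python
import hashlib


def _hash64(data: bytes, seed: int) -> int:
    """Seeded 64-bit hash (BLAKE2b is an assumption, not from the source)."""
    h = hashlib.blake2b(data, digest_size=8, salt=seed.to_bytes(8, "little"))
    return int.from_bytes(h.digest(), "little")


def shingles(text: str, w: int = 3) -> set:
    """Features of a document: the set of its w-word shingles (assumed feature type)."""
    words = text.split()
    return {" ".join(words[i:i + w]) for i in range(max(1, len(words) - w + 1))}


def sketch(text: str, s: int = 6, k: int = 4, b: int = 16) -> list:
    """Return s supersamples, each combining k samples and kept to b bits."""
    feats = [f.encode() for f in shingles(text)]
    supersamples = []
    for i in range(s):
        # k min-hash samples, each drawn with its own hash seed
        mins = [min(_hash64(f, i * k + j) for f in feats) for j in range(k)]
        # combine the k samples into one supersample
        combined = b"".join(m.to_bytes(8, "little") for m in mins)
        # record only b bits of the supersample
        supersamples.append(_hash64(combined, 0) & ((1 << b) - 1))
    return supersamples


def similar(sketch_a: list, sketch_b: list, l: int = 2) -> bool:
    """Two objects count as similar when at least l supersamples agree position by position."""
    return sum(x == y for x, y in zip(sketch_a, sketch_b)) >= l
```

In this sketch each object stores s × b bits. Lowering k makes an individual supersample match more likely and lowering b makes accidental matches more likely, which is why the threshold `l` is raised to compensate.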


